1.0 INTRODUCTION




Task allocation, the assignment of tasks to processors, is an important problem in the
design of distributed real-time systems. A task allocation scheme is required in order
to produce a feasible partition of tasks across processors in the system, and to ensure
high performance, especially for systems with real-time operational requirements.
Several researchers have studied the task allocation problem for distributed systems;
[Chu80] contains a survey of various approaches.


One of the problems of current interest in real-time system design is the development
of real-time Ada software for distributed systems. Several approaches have been
proposed and are being studied; a survey can be found in [Arm84]. The approaches
can be characterized as either

• source code allocation, the development of multitasking Ada software, which is
then partitioned; this approach allows development and testing of the software
as a whole before allocation;

• target-code allocation, where a compiler is responsible for performing task
allocation, perhaps with some user-imposed constraints; or

• separate program development, where allocation decisions are made early in
the development phase, and separate programs are developed.


The traditional approach of developing separate source programs for each processor
in the distributed system requires the system designer to make early decisions on
allocation, taking into account resource and performance constraints. This approach,
however, increases the difficulty of software reallocation in later phases of the
software life cycle. Target code allocation schemes require a distributed target
compiler that generates a separate object code file for each processor. There are
two ways that the compiler can partition Ada application software: (1) being informed
via pragmas about a predetermined partition scheme; or (2) analyzing the application
software, and then applying a partitioning algorithm. The compiler required for this
allocation scheme is complex, difficult to design, and presently not available.


The source code allocation approach has a number of important advantages. In an
allocation scheme, it is preferable to place minimum design restrictions on software
development, especially since the target architecture may, in most cases, not be
known. It is also preferable to minimize the burden on the compiler. Additionally,
since the underlying system constraints may vary, a good partition can be achieved
only through iteration, and therefore creating new partitions should be inexpensive.

A  source  code  allocation  approach  that  meets  the  above  objectives  was  adopted  by 
GTE  Strategic  Systems  Division  in  its  multicomputer  software  technology  for  Ada.  This 
approach  allows  an  application  to  be  developed  and  tested  as  a  single  multitasking 
Ada  program  on  the  APSE  (Ada  Programming  Support  Environment),  and  then 
partitions  and  distributes  the  tested  software  to  the  distributed  targets.  Program 
partitioning  is  done  at  the  source  level,  and  the  distributed  software  modules  are 
compiled  on  the  target  machines. 

GTE Laboratories researched the development of a methodology for partitioning Ada
source code to execute in a distributed environment. Two major tasks were involved
in the development of such a methodology:

1.  the  formulation  and  selection  of  parameters  that  can  be  derived  from  the  Ada 
source  code,  to  be  used  in  the  partitioning  process;  and 


2.  the  development  of  an  efficient  partitioning  algorithm. 


The approach taken closely follows that described in a previous report [GTE85]. The
basic goal of the approach is to transform a given Ada program into a graph-based
representation, and then to apply a partitioning algorithm for task allocation. The
graph-based representation is similar to the specification schemes being investigated
at the University of Texas at Austin [Mok84, Mok85].

In this paper, we describe the research efforts towards achieving the above tasks. In
section 2, we describe the partitioning problem and an algorithm for it. In
section 3, we define partitionable units in Ada software. In section 4, we enumerate
parameters derivable from Ada source code to be used in partitioning. In section 5,
we present an Ada example. Section 6 concludes the report with a suggestion for a
future research direction.


2.0 PARTITIONING PROBLEM AND ALGORITHMS

The problem of allocating Ada source over distributed targets can be formulated as a
graph partitioning problem. For purposes of task allocation, an Ada program can be
represented by a graph $G = (V, E)$ as follows:

• the vertices of the graph $G$, i.e., the set $V$, represent the partitionable Ada
program units; and

• the communication or dependency between units is represented by the edge set
$E$.

Given this representation of the Ada program, weights are assigned to the vertices
and edges; a weight $w(v)$ for vertex $v \in V$ represents execution characteristics of the
Ada partitionable unit, obtained from such parameters as computation, memory and
similar resource requirements. The weight $w(e_{ij})$ assigned to the edge between unit
(vertex) $v_i$ and unit (vertex) $v_j$ represents the total communication cost between the
two units; this weight is obtained from such parameters as the number of data
elements transferred in a communication, and the number of messages required per
transaction.

The partitioning problem for graph $G = (V, E)$ can be formulated as follows: determine a
partition of $V$ into $m$ disjoint subsets $V_1, V_2, \ldots, V_m$ such that

(1) $J \le \sum_{v \in V_i} w(v) \le K$, $1 \le i \le m$, for some constants $J$ and $K$

and

(2) $l(V_i, V_j) = \sum_{e \in E_{ij}} w(e) \le L$, for some constant $L$, where $E_{ij} \subseteq E$ and
$(v', v'') \in E_{ij} \Rightarrow v' \in V_i$, $v'' \in V_j$ and $V_i \ne V_j$.

The partitioning problem, as formulated, aims at reducing the communication cost
$l(V_i, V_j)$ between the partitioned clusters, and places a load balancing constraint on
the clusters. These objectives are appropriate for the Ada partitioning problem as
system performance is affected by factors such as interprocessor communication
delays, processor load, and the amount of parallelism that can be exploited.
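
As a rough illustration of the formulation above, the following Python sketch checks conditions (1) and (2) for a candidate partition. The dictionary-based graph encoding, the function names, and the toy weights are assumptions made for illustration only; they are not taken from the report.

```python
from itertools import combinations

def subset_load(subset, w):
    """Sum of the vertex weights w(v) over one subset V_i."""
    return sum(w[v] for v in subset)

def cut_cost(vi, vj, edge_w):
    """l(V_i, V_j): total weight of edges with one end in V_i and the other in V_j."""
    return sum(
        weight
        for (a, b), weight in edge_w.items()
        if (a in vi and b in vj) or (a in vj and b in vi)
    )

def is_feasible(partition, w, edge_w, J, K, L):
    """Condition (1) on every subset, condition (2) on every pair of subsets."""
    balanced = all(J <= subset_load(s, w) <= K for s in partition)
    cheap = all(cut_cost(a, b, edge_w) <= L for a, b in combinations(partition, 2))
    return balanced and cheap

# Hypothetical toy graph (not from the report).
w = {"p": 4, "q": 3, "r": 5, "s": 2}
edge_w = {("p", "q"): 6, ("q", "r"): 1, ("r", "s"): 7, ("p", "s"): 2}
partition = [{"p", "q"}, {"r", "s"}]

print(cut_cost(partition[0], partition[1], edge_w))      # edges q-r and p-s cross: 3
print(is_feasible(partition, w, edge_w, J=5, K=9, L=4))  # True
```

Tightening `L` below 3 makes the same partition infeasible, which is how condition (2) bounds the communication cost between clusters.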


The  partitioning  problem  as  formulated  has  been  shown  to  be  NP-complete  [Gar79]; 
however,  there  are  partitioning  algorithms  that  use  heuristics  to  obtain  close  to 
optimal  results  with  acceptable  algorithm  performance  [Pri84,Ker69]. 


The heuristic techniques reported in the literature can be classified into three
categories [Lin81]: (1) constructive initial assignment, (2) iterative
assignment-improvement, and (3) branch and bound.


The  constructive  initial  assignment  techniques  are  based  on  the  concept  of  assigning 
one  unit  at  a  time  to  a  particular  processor  until  all  the  units  are  assigned.  Algorithms 
vary  on  the  order  in  which  the  units  are  assigned  and  the  criteria  used  to  select  the 
processor. 


The iterative assignment-improvement techniques start with an initial assignment of
units to processors and generate the next assignment by making a small improvement
to the current assignment [Ker69]. The algorithms terminate when no improvement can
be discovered or after a predetermined number of iterations.


The branch and bound methods are based on the concept of doing an implicit search
of a decision tree. Algorithms use different heuristic methods for deciding which
branch in the decision tree to follow and for pruning possible solutions.


The iterative assignment-improvement algorithms are used more than the other two
techniques. In general, the branch and bound algorithms are too slow for large
applications, and the constructive initial assignment algorithms do not generate
partitions that are as good as those of the other two techniques.

It  is  expected  that  the  number  of  vertices  in  a  graph  obtained  from  Ada  source  will  be 
large,  and  therefore  heuristics  will  need  to  be  developed  not  only  for  creating  good 
partitions,  but  also  for  partitioning  in  an  acceptable  amount  of  time. 

In the remainder of this section, we describe an algorithm based on a well-known
graph partitioning algorithm by Kernighan and Lin [Ker69]. The algorithm uses the
iterative assignment-improvement technique.

The main idea of the algorithm is to start with an initial partitioning into $m$ subsets and,
by repeated application of iterative assignment-improvement techniques to pairs of
subsets, to achieve a near-pairwise-optimal state.

To obtain the initial partition, the units are assigned, one by one, to the subset with
least weight first, provided the balancing inequalities are satisfied. Next, the
iterative assignment-improvement algorithm is applied to every pair of subsets, and
it works as follows:

Let $A$, $B$ be two arbitrary subsets. The algorithm identifies $X$ and $Y$, subsets of $A$ and $B$
respectively, such that interchanging $X$ and $Y$ produces $A^*$ ($= A - X + Y$) and $B^*$
($= B - Y + X$) that satisfy the following conditions:

(1) $J \le \sum_{v \in A^*} w(v) \le K$,

$J \le \sum_{v \in B^*} w(v) \le K$, and

(2) $l(A^*, B^*) < l(A, B)$.


The algorithm finds $X$ and $Y$ by sequentially identifying their elements without
considering all possible choices. A pair of units $(a, b)$, where $a \in A$ and $b \in B$, is
selected to yield the largest possible reduction in the interprocessor communication
cost from a single interchange. We refer to this reduction in the communication cost
as the gain from interchanging $a$ and $b$, and we denote this gain $g(a, b)$.
Mathematically, the gain $g(a, b)$ is

$$g(a, b) = \Big( \sum_{y \in B'} w((a, y)) - \sum_{x \in A'} w((a, x)) \Big) + \Big( \sum_{x \in A'} w((b, x)) - \sum_{y \in B'} w((b, y)) \Big)$$

where $A' = A - \{a\}$ and $B' = B - \{b\}$.

The gain $g(x, y)$ is defined to be zero if interchanging $x$ and $y$ upsets the balancing
inequalities. The elements $a$ and $b$ are interchanged to form $A_1$ ($= A - \{a\} + \{b\}$) and
$B_1$ ($= B - \{b\} + \{a\}$). The elements $a$ and $b$ are then eliminated from further
consideration for exchange. The exchange procedure continues until all units have
been exhausted, or there is no further possible exchange that will yield a positive
gain.
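
The pairwise exchange procedure described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the names `gain` and `improve_pair`, the dictionary encoding of the graph, and the toy data are hypothetical. It follows the stopping rule as worded in the text (stop when no swap has a positive gain), which is simpler than the original Kernighan-Lin procedure.

```python
def gain(a, b, A, B, edge_w, w, J, K):
    """g(a, b) as defined in the text; zero if the swap breaks the balance bounds."""
    def wt(u, v):
        # Undirected lookup; a missing edge has weight 0.
        return edge_w.get((u, v), edge_w.get((v, u), 0))

    load_a = sum(w[v] for v in A) - w[a] + w[b]
    load_b = sum(w[v] for v in B) - w[b] + w[a]
    if not (J <= load_a <= K and J <= load_b <= K):
        return 0
    a_rest, b_rest = A - {a}, B - {b}
    return (
        sum(wt(a, y) for y in b_rest) - sum(wt(a, x) for x in a_rest)
        + sum(wt(b, x) for x in a_rest) - sum(wt(b, y) for y in b_rest)
    )

def improve_pair(A, B, edge_w, w, J, K):
    """Swap the best unlocked pair repeatedly until no positive gain remains."""
    A, B = set(A), set(B)
    locked = set()  # units already exchanged are excluded from further swaps
    while True:
        candidates = [
            (gain(a, b, A, B, edge_w, w, J, K), a, b)
            for a in sorted(A - locked)
            for b in sorted(B - locked)
        ]
        if not candidates:
            break
        g, a, b = max(candidates)
        if g <= 0:
            break
        A, B = (A - {a}) | {b}, (B - {b}) | {a}
        locked |= {a, b}
    return A, B

# Hypothetical toy data: heavy edges p-r and q-s initially cross the cut.
w = {"p": 1, "q": 1, "r": 1, "s": 1}
edge_w = {("p", "r"): 5, ("q", "s"): 5, ("p", "q"): 1, ("r", "s"): 1}
print(gain("q", "r", {"p", "q"}, {"r", "s"}, edge_w, w, J=1, K=3))  # 8
print(improve_pair({"p", "q"}, {"r", "s"}, edge_w, w, J=1, K=3))
```

In the toy data, one swap brings the heavily communicating pairs onto the same side, reducing the cut from 10 to 2, after which the only remaining swap has negative gain and the pass stops.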


3.0  PARTITIONABLE  UNITS 


In our framework of source code partitioning, we discuss what constitutes a
partitionable unit of an Ada program. Since the partitioned software has to be
compiled on each processor, the partitioned units must be separately compilable. In
Ada, there are four kinds of program units that can be separately compiled. They are
tasks, subprograms, packages, and generic units.

Subprograms  are  the  basic  executable  units  of  Ada  programs.  They  can  be 
procedures  or  functions.  A  subprogram  communicates  with  outside  entities  via  global 
declarations  or  parameter  passing  upon  its  invocation  and  termination. 

In Ada, a collection of logically related entities can be encapsulated in a package. A
package allows its entities to communicate with an entity outside the package via
global declaration or by the import and export mechanism. The entities declared in
the visible part of the package specification may be used outside the package, and
entities in another package may be used by establishing visibility through the with
clause.

Unlike subprograms and packages, tasks operate in parallel with other program units.
The main program unit is implicitly considered to be a task. In Ada, task interaction is
handled by treating each task as a communicating sequential process [Hoa78]. The
tasks are synchronized in time when they communicate. The explicit synchronization
is known as a rendezvous. Similar to a package specification, a task specification
defines the communication entries available to other tasks.

These three kinds of program units can be introduced in the declarative part of any
unit. This makes the communication among units non-trivial. We discuss this in a
later section.

Some distributed Ada systems allow partitioning on task boundaries only. This
approach appears to have achieved a synergy between Ada's unit of concurrency, the
task, and the underlying system's unit of concurrency, the processor. However, this
approach requires all code being partitioned to be encapsulated by a task. This
requires anticipating the partitioning to make sure that tasks are designed at the
appropriate place. This does not meet our objective of making minimum design
restrictions. In some applications, limiting the interprocessor interface to only task
rendezvous may be unnatural; an interprocessor interface may be better represented
as a call to a procedure inside a package and not a call to an entry for a task.

We propose that partitioning be allowed on the boundaries of these three kinds of
program units. We do not explicitly include generic units as partitionable units. We can
view generic packages/subprograms and their instantiations as
packages/subprograms in the partitioning scheme. Since the partitioned units have to
be compiled on the target machines, we may require the partitionable units to be
designed as Ada compilation units for distribution purposes. This does not impose
any syntactic or design restrictions, since Ada program units can be
submitted as separate compilation units or as one compilation. However, it must be
understood that a compilation unit does not have to be a partitionable unit. In Ada,
each compilation unit specifies the separate compilation of a construct that can be a
subprogram declaration or body, a package declaration or body, a generic declaration
or body, or a generic instantiation. A compilation unit can also be the body of a task.


4.0  PARTITION  MODEL 


We have discussed the general partition problem and our proposed partitionable units
in Ada in previous sections. We now propose a model for partitioning an Ada program
for distributed targets.

The first step in our modeling is to represent the interunit communication as a graph
$G = (V, A)$, where $V$ is the set of vertices representing the partitionable units of an
application, and $A$ is the set of arcs representing the communication. Our next steps
will be examining how to assign weights to the vertices and the arcs.


4.1  INTERUNIT  COMMUNICATION 

A  unit  can  be  a  subprogram,  a  task,  or  a  package  of  data  objects.  There  are  four  kinds 
of  communication  among  these  units. 

The  first  is  subprogram  invocation.  A  subprogram's  execution  is  invoked  by  a 
subprogram  call  from  another  subprogram  or  a  task.  After  the  association  between 
formal  parameters  and  actual  parameters  is  established,  execution  control  is  passed 
to  the  called  subprogram.  Upon  completion,  control  is  returned  to  the  caller.  The 
subprogram  invocation  follows  a  single  thread  of  control.  The  communication  cost  is 
incurred  at  invocation  and  completion. 

The second kind of interunit communication is task rendezvous. Different tasks
execute independently, except when they communicate. A task entry can be called by
another task. Communication is established when the called task accepts the call. If
the entry has parameters, values are communicated between the tasks. After this
synchronization, the task issuing the entry call and the task accepting the call continue
their execution independently.


Task activation/termination is another kind of communication related to tasks. This is
an implicit communication in the control flow of the task dynamics. The initial part of
task execution is called activation. A task is activated as a result of the elaboration
(execution) of the declarative part of its parent task or as a result of the allocation of a
new task. A task is said to be terminated when it is completed (it finishes its last
executable statement) and all its dependents are terminated. Therefore, upon
termination, a dependent task needs to communicate its state to its parent. This kind
of activation/termination communication occurs between a task and any other kind of
partitionable unit.

Data reference/modification is a different kind of interunit communication that is not
explicit. A partitionable unit can reference any visible data defined in other units. A
data definition in a unit is made visible to another unit either by scope rules or by with
clauses. Data reference/modification is purely data flow; there is no control flow
involved.

Two  units  with  any  of  these  four  kinds  of  communication  will  experience  some 
network  delay  when  they  are  allocated  to  different  processors.  Over  the  same 
network,  different  kinds  of  communication  take  different  amounts  of  time.  We  discuss 
the  weights  on  these  kinds  of  communication  in  more  detail  below. 

Although  communication  is  bidirectional,  we  like  to  assign  direction  to  it  for  analysis 
purposes.  We  say  a  communication  is  from  unit  A  to  unit  B  if  unit  A  initiates  the 
communication. 

We assign a weight to every communication from unit A to unit B if they are assigned
to different processors. The weight of an interprocessor communication depends on
the number of messages required for such a communication. In the case of
subprogram invocation and task rendezvous, the number of arguments in the call has
to be taken into account for the weight of the communication.


For a call to a subprogram on a different processor, two messages are needed: a
"call" message containing the IN parameters, if there are any, and a "return" message
containing any OUT parameters:

    subprogram call:     s.call   | requesting unit | subprogram id | input parameters
    subprogram return:   s.return | requesting unit | output parameters


We assign a weight of

    3 + number of input parameters

to a subprogram "call" message and a weight of

    2 + number of output parameters

to a subprogram "return" message. Thus a subprogram invocation type of
communication is assigned a weight of

    5 + number of input parameters + number of output parameters.

In addition to the "call" and "return" messages, there are two more messages needed
for each normal task rendezvous [Wea84]. After a "call" message is sent to the
accepting task and when the accepting task is ready to accept the call, it returns an
"accept" message to the calling task. If the calling task still desires rendezvous, it
returns a "confirm" message to the accepting task. The message passing required for
task rendezvous is depicted in figure 1. The "accept" and "confirm" messages are less
complex than the "call" and "return" messages that contain IN and OUT parameters:

    accept:    accept  | requesting unit
    confirm:   confirm | requesting unit
    return:    return  | requesting unit | output parameters


Weights  are  assigned  to  each  of  these  four  messages  in  the  same  manner  as  for 
subprogram  invocation  messages.  The  weight  assigned  to  a  task  rendezvous  type  of 
communication  is 

9  +  number  of  input  parameters  +  number  of  output  parameters. 
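
The constant 9 can be checked by summing the four message weights. The worked sum below assumes, as the message layouts suggest, that each fixed header field contributes one unit of weight, so that the "accept" and "confirm" messages weigh 2 each; the symbols $n_{in}$ and $n_{out}$ (numbers of input and output parameters) are introduced here for brevity.

```latex
\underbrace{(3 + n_{in})}_{\text{call}}
+ \underbrace{2}_{\text{accept}}
+ \underbrace{2}_{\text{confirm}}
+ \underbrace{(2 + n_{out})}_{\text{return}}
= 9 + n_{in} + n_{out}
```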


In the event of elaboration, a task sends an "elaborate" message to all its dependent
tasks and waits for an "active" message from each of them. When a task completes
its last executable statement, it waits for a "terminate" message from each of its
dependents before it terminates. Therefore, a total of three messages between
a parent and each of its dependents are required for activation and termination:

    task activation
    task activation return
    task termination

A task activation/termination type of communication is thus assigned a constant
weight of 7.

For data reference/modification between two units on different processors, a "request"
message is sent from the initiator and a "response" message is returned:

    write
    write return

A data reference/modification type of communication is assigned a weight of

    5 + size of the data object.

Knowing  the  weight  of  each  kind  of  communication,  the  weight  of  an  edge  from  unit  A 
to  unit  B  is  computed  as  the  sum  of  the  weights  of  all  communication  from  A  to  B. 
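
To make the edge-weight rule concrete, the following Python sketch encodes the four weight formulas from this section and sums them for one pair of units. The function names and the tuple encoding of communications are assumptions for illustration. The sample input is read off the entry calls that MONITOR_TEMPERATURES makes to COLLECTION_OF_SENSORS in the appendix, counting each call site once; whether the report counted communications in exactly this way is also an assumption.

```python
def subprogram_invocation_weight(n_in, n_out):
    """'call' message (3 + inputs) plus 'return' message (2 + outputs)."""
    return (3 + n_in) + (2 + n_out)

def task_rendezvous_weight(n_in, n_out):
    """call, accept, confirm and return messages together."""
    return 9 + n_in + n_out

TASK_ACTIVATION_TERMINATION_WEIGHT = 7

def data_access_weight(object_size):
    """Request and response messages for a data reference/modification."""
    return 5 + object_size

def edge_weight(communications):
    """Sum the weights of all communications from one unit to another.

    `communications` is a list of (kind, args) tuples, a hypothetical encoding.
    """
    total = 0
    for kind, args in communications:
        if kind == "call":
            total += subprogram_invocation_weight(*args)
        elif kind == "rendezvous":
            total += task_rendezvous_weight(*args)
        elif kind == "activation":
            total += TASK_ACTIVATION_TERMINATION_WEIGHT
        elif kind == "data":
            total += data_access_weight(*args)
        else:
            raise ValueError(f"unknown communication kind: {kind}")
    return total

monitor_to_sensors = [
    ("rendezvous", (1, 0)),  # DISABLE(SENSOR)
    ("rendezvous", (1, 0)),  # ENABLE(SENSOR)
    ("rendezvous", (1, 0)),  # FORCE_RECORD(OF_SENSOR)
    ("rendezvous", (3, 0)),  # SET_THE_LIMITS(FOR_SENSOR, LOW_LIMIT, HIGH_LIMIT)
    ("activation", ()),      # the main program is the parent of the task
]
print(edge_weight(monitor_to_sensors))  # 10 + 10 + 10 + 12 + 7 = 49
```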

4.2  COMPUTATIONAL  COMPLEXITY  OF  THE  UNITS 

Several program complexity metrics have been developed for various purposes such
as maintainability and understandability. For partitioning purposes, we are interested
in the computational complexity of a unit. The complexity measure includes the unit's
time and space requirements. Knowing the complexity of each unit, we may be able
to achieve better load balancing for the processors in the system. We use a simple
definition of load balancing. Load balancing is an assignment of units to processors
such that the time and space requirements are evenly distributed to each processor in
the system.

A unit, except a package of data, contains a code portion and a data portion. A
suitable metric for measuring the space requirements of the code portion of a unit
might be the number of machine instructions. However, the number of machine
instructions generated from Ada source is compiler dependent. In general, the
expansion ratio of the number of machine instructions per line achieved by a compiler
is not known. GTE-GS SSD has done some work in measuring empirically the
expansion ratio of a group of compilers [Che86]. Since we are using a less strict
definition of load balancing, that is, we are not aiming at an optimal assignment, the
number of source lines could be a good estimate of the space requirement for a unit's
code portion. Similar to code space requirements, a unit's data storage requirement
is compiler dependent. At this point, we are going to use a set of assumptions about
Ada data type storage requirements.

The time complexity of a task or subprogram unit that does not contain a loop
statement can be measured by the number of machine instructions generated for the
unit. To analyze a unit with loop statements is nontrivial. A loop statement without an
iteration scheme can be repeatedly executed until a transfer of control occurs. In
most cases, the number of times a loop is going to be executed cannot be predicted
until the transfer of control occurs. Even for a loop statement with a "while" iteration
scheme, it is difficult to estimate the number of iterations. We can only be certain of
the number of iterations in a loop statement with a "for" iteration scheme. We can
view all loop statements with or without iteration schemes as a sequence of code to
be executed periodically. Therefore, the time requirement of a unit can be estimated
by the number of source lines without regard to loop statements.
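
A source-line estimate of this kind could be computed along the following lines. This Python sketch is only one possible counting rule: skipping blank lines and lines that contain nothing but an Ada `--` comment is an assumption, since the report does not say how lines were counted, and the function name is hypothetical.

```python
def count_source_lines(ada_source):
    """Count lines that are neither blank nor pure '--' comments."""
    count = 0
    for line in ada_source.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("--"):
            count += 1
    return count

sample = """task TIMER is
   entry INTERRUPT;

   -- hardware interrupt entry
end TIMER;
"""
print(count_source_lines(sample))  # 3
```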


5.0 AN EXAMPLE


In this section we examine an Ada embedded system for monitoring temperatures as
described in [Boo83]. In the appendix, we include the program, which was taken in its
entirety from chapter 18 of [Boo83]. We formulate the partitioning problem and apply
the partitioning algorithm described in section 2 to it.

The Ada program consists of the main program, four major tasks and a number of I/O
packages. We center our partitioning problem on the main program and the four
tasks, and ignore the I/O packages to simplify the discussion. The partitioning
problem we are addressing is as follows:

Find a partition of the five Ada units TIMER, ALARM, COLLECTION_OF_SENSORS,
RECORDING_DEVICE, and MONITOR_TEMPERATURES into two subsets $V_1$ and $V_2$ such
that

(1) $36 \le \sum_{v \in V_i} w(v) \le 190$

where $\big(\sum_{v \in V} w(v)\big)/2 - \max_{v \in V} w(v) = 36$

and $\big(\sum_{v \in V} w(v)\big)/2 + \max_{v \in V} w(v) = 190$; and

(2) $l(V_1, V_2)$, the interprocessor communication cost, is near-minimal.


The  complexities  of  the  units  are 


    Unit                      Source lines of code

    TIMER                     48
    ALARM                     30
    COLLECTION_OF_SENSORS     53
    RECORDING_DEVICE          19
    MONITOR_TEMPERATURES      77
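
The balancing bounds in condition (1) can be recomputed from these line counts. The short Python sketch below does so; the use of integer division is an assumption, made because the exact halves (36.5 and 190.5) are stated in the text as 36 and 190.

```python
weights = {
    "TIMER": 48,
    "ALARM": 30,
    "COLLECTION_OF_SENSORS": 53,
    "RECORDING_DEVICE": 19,
    "MONITOR_TEMPERATURES": 77,
}

total = sum(weights.values())     # 227
heaviest = max(weights.values())  # 77

# Assumption: halves are rounded down to match the bounds quoted in the text.
J = total // 2 - heaviest
K = total // 2 + heaviest
print(J, K)  # 36 190
```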



Figure  2.  Interunit  Communication 


We sum up the interunit communications and produce the graph representation of the
program as shown in figure 3.

In total, there are sixteen ways to partition five units. Three of these partitions are not
qualified as they fail to satisfy the balancing constraints. The thirteen possible
partitions and their corresponding communication costs are


    V1          V2          l(V1, V2)

    T           A,C,R,M     26
    T,A         C,R,M       34
    T,R         A,C,M       45
    T,C,M       A,R         52
    T,A,R       C,M         53
    T,A,C,R     M           70
    T,A,R,M     C           81
    T,M         A,C,R       82
    T,A,C       R,M         82
    T,C,R       A,M         82
    T,C         A,R,M       87
    T,R,M       A,C         87
    T,A,M       C,R         88

where

    T = TIMER
    A = ALARM
    C = COLLECTION_OF_SENSORS
    R = RECORDING_DEVICE
    M = MONITOR_TEMPERATURES
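
The counts of sixteen candidate and thirteen qualified partitions can be reproduced by enumeration. The Python sketch below pins T in the first subset so that mirror-image partitions are not counted twice; this pinning, and the inclusion of the trivial split with an empty second subset, are assumptions chosen because they give the total of sixteen stated in the text.

```python
from itertools import combinations

weights = {"T": 48, "A": 30, "C": 53, "R": 19, "M": 77}
J, K = 36, 190  # balancing bounds from condition (1)

def load(subset):
    return sum(weights[u] for u in subset)

others = [u for u in weights if u != "T"]
candidates = []
for r in range(len(others) + 1):
    for extra in combinations(others, r):
        v1 = {"T", *extra}
        v2 = set(weights) - v1
        candidates.append((v1, v2))

feasible = [
    (v1, v2) for v1, v2 in candidates
    if J <= load(v1) <= K and J <= load(v2) <= K
]
rejected_v2 = [sorted(v2) for v1, v2 in candidates if (v1, v2) not in feasible]

print(len(candidates), len(feasible))  # 16 13
print(rejected_v2)
```

The three rejected splits are the ones whose second subset is empty, {A} (weight 30), or {R} (weight 19), each below the lower bound of 36.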


We obtain an initial partition by assigning each unit to $V_1$ or $V_2$, depending on which
has less weight. So $V_1$ = {TIMER, RECORDING_DEVICE, MONITOR_TEMPERATURES} and $V_2$
= {ALARM, COLLECTION_OF_SENSORS}. This initial partition has a cost of $l(V_1, V_2)$
= 87. We are searching for the first interchange that will yield a positive reduction in
cost. There are six possible pairwise interchanges, and their corresponding gains are

    TIMER                  <->  COLLECTION_OF_SENSORS    53
    RECORDING_DEVICE       <->  COLLECTION_OF_SENSORS    42
    MONITOR_TEMPERATURES   <->  ALARM                    34
    RECORDING_DEVICE       <->  ALARM                    11
    MONITOR_TEMPERATURES   <->  COLLECTION_OF_SENSORS     5
    TIMER                  <->  ALARM                     0

We pick the pair TIMER and COLLECTION_OF_SENSORS for interchange to give a
maximum positive reduction in cost and form $V_1'$ and $V_2'$, where $V_1'$ =
{MONITOR_TEMPERATURES, COLLECTION_OF_SENSORS, RECORDING_DEVICE} and $V_2'$ =
{ALARM, TIMER}. Next, we exclude COLLECTION_OF_SENSORS and TIMER from
consideration for exchange. There are then two pairs of units to be considered:
(MONITOR_TEMPERATURES, ALARM) and (RECORDING_DEVICE, ALARM). Both pairs
yield negative gains. We stop the interchange process. The final partition is thus
{$V_1'$, $V_2'$}, which has an interprocessor communication cost of 34; that is the
second best partition.
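
The initial partition used at the start of this example can be reproduced with a short greedy assignment. The Python sketch below is a simplified reading of the procedure: the processing order (the order in which the units are listed in the problem statement) and the tie-breaking rule are assumptions, and the balance check during assignment mentioned in section 2 is omitted for brevity.

```python
def initial_partition(units, weights, m=2):
    """Assign each unit, in order, to the subset that currently has least weight."""
    subsets = [[] for _ in range(m)]
    loads = [0] * m
    for unit in units:
        i = loads.index(min(loads))  # ties go to the lowest-numbered subset
        subsets[i].append(unit)
        loads[i] += weights[unit]
    return subsets, loads

weights = {
    "TIMER": 48,
    "ALARM": 30,
    "COLLECTION_OF_SENSORS": 53,
    "RECORDING_DEVICE": 19,
    "MONITOR_TEMPERATURES": 77,
}
order = ["TIMER", "ALARM", "COLLECTION_OF_SENSORS",
         "RECORDING_DEVICE", "MONITOR_TEMPERATURES"]

subsets, loads = initial_partition(order, weights)
print(subsets[0])  # ['TIMER', 'RECORDING_DEVICE', 'MONITOR_TEMPERATURES']
print(subsets[1])  # ['ALARM', 'COLLECTION_OF_SENSORS']
print(loads)       # [144, 83]
```

Both loads fall within the bounds of 36 and 190, so the omitted balance check would not have changed the outcome here.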


6.0  CONCLUSION 


In this paper, we have discussed the problem of partitioning real-time Ada software for
distributed targets and adopted source code allocation as an approach to the problem.
In this approach, code for an application is developed and tested as a single Ada
program, and then partitioned and distributed to distributed targets, where compilation
takes place at each location. A partitioning methodology for Ada programs has been
outlined.

In the example we presented, the partitioning methodology found the second best of
the thirteen qualified partitions. However, its performance has not been assessed formally.

We have used a simple definition of load balancing. This definition is acceptable only
for O(n) type programs. It is necessary to provide a more accurate measure of load
balancing for other types of programs.

In this report, we have considered the complexities of the partitionable units and
message complexities of interunit communications. We did not consider parameters
such as channel capacities or complexities of the processors. We feel that there is a
need for research into identifying important parameters for partitioning.


APPENDIX  A 


with  TEXT  JO.  SYSTEM 
uti  TEXT  JO: 

procedure  MONITOR-TEMPERATURES  It 


typ*  COMMAND 


it  (DISABLE. 

RECORD-STATUS 


ENABLE. 

SET-LIMITS); 


typ*  SENSOR-NAME  it  (LOBBY.  MAIN-OFFICE.  WAREHOUSE. 

STOCK-ROOM.  TERMINAI _ ROOM.  LIBRARY. 

COMPUTER-ROOM.  LOUNGE.  LOADING-DOCK, 

CLEAN_ROOM). 

typ*  SENSOR-STATE  it  {DISABLED.  ENABLED): 
typ*  SENSOR-VALUE  it  delta  0  5  rang*  0  0  100  0: 


package  COMMANOJO  it  naw  ENUMERATION.  lO(COMMANO). 

ut*  COMMAND  JO: 

package  SENSOR-NAME  JO  it  naw  ENUMERATION_IO(SENSOR_NAME); 

ut*  SENSOR-NAME  JO: 

packagt  SENSOR- VALUE- 10  It  n*w  FIXED-lO(SENSOR. VALUE); 
ut*  SENSOR.  VALUE-IO. 


   task ALARM is
      entry POST_FAULT_IN_SENSOR;
      entry POST_OUT_OF_LIMITS(ON_SENSOR : in SENSOR_NAME);
   end ALARM;

   task RECORDING_DEVICE is
      entry LOG_THE_STATUS(OF_SENSOR  : in SENSOR_NAME;
                           WITH_VALUE : in SENSOR_VALUE;
                           WITH_STATE : in SENSOR_STATE);
   end RECORDING_DEVICE;

   task COLLECTION_OF_SENSORS is
      entry DISABLE       (SENSOR     : in SENSOR_NAME);
      entry ENABLE        (SENSOR     : in SENSOR_NAME);
      entry FORCE_RECORD  (OF_SENSOR  : in SENSOR_NAME);
      entry SET_THE_LIMITS(FOR_SENSOR : in SENSOR_NAME;
                           LOW_LIMIT  : in SENSOR_VALUE;
                           HIGH_LIMIT : in SENSOR_VALUE);
   end COLLECTION_OF_SENSORS;

   task TIMER is
      entry INTERRUPT;
      for INTERRUPT use at 16#8E#;
   end TIMER;

   HIGH_BOUND   : SENSOR_VALUE;
   LOW_BOUND    : SENSOR_VALUE;
   NAME         : SENSOR_NAME;
   USER_COMMAND : COMMAND;
   VALUE        : SENSOR_VALUE;


   task body ALARM                 is separate;
   task body RECORDING_DEVICE      is separate;
   task body COLLECTION_OF_SENSORS is separate;
   task body TIMER                 is separate;


begin
   loop
      begin  -- start of a local block with exception handler
         PUT("Enter your command");
         GET(USER_COMMAND);
         NEW_LINE;
         PUT_LINE("Command accepted");
         case USER_COMMAND is
            when DISABLE =>
               PUT("Enter sensor name:");
               GET(NAME);
               NEW_LINE;
               COLLECTION_OF_SENSORS.DISABLE(SENSOR => NAME);
               PUT_LINE("Sensor disabled");
            when ENABLE =>
               PUT("Enter sensor name:");
               GET(NAME);
               NEW_LINE;
               COLLECTION_OF_SENSORS.ENABLE(SENSOR => NAME);
               PUT_LINE("Sensor enabled");
            when RECORD_STATUS =>
               PUT("Enter sensor name:");
               GET(NAME);
               NEW_LINE;
               COLLECTION_OF_SENSORS.FORCE_RECORD
                  (OF_SENSOR => NAME);
               PUT_LINE("Sensor status set");
            when SET_LIMITS =>
               PUT("Enter sensor name: ");
               GET(NAME);
               NEW_LINE;
               PUT("Enter lower limit: ");
               GET(LOW_BOUND);
               NEW_LINE;
               PUT_LINE("Lower limit accepted");
               PUT("Enter upper limit: ");
               GET(HIGH_BOUND);
               NEW_LINE;
               PUT_LINE("Upper limit accepted");
               COLLECTION_OF_SENSORS.SET_THE_LIMITS
                  (FOR_SENSOR => NAME,
                   LOW_LIMIT  => LOW_BOUND,
                   HIGH_LIMIT => HIGH_BOUND);
               PUT_LINE("Limits set");
         end case;
      exception
         when DATA_ERROR =>
            PUT_LINE("Illegal entry try again");
      end;
   end loop;
end MONITOR_TEMPERATURES;


separate (MONITOR_TEMPERATURES)
task body TIMER is
   MINUTES : constant := 1;
   type INTERVAL is range 0 .. 15;
   TICKS : INTERVAL := 0;
begin
   loop
      accept INTERRUPT do
         TICKS := TICKS + 1;
         if TICKS = 15 * MINUTES then
            for I in SENSOR_NAME
            loop
               -- timed entry call: raise the fault alarm if the sensor
               -- task does not accept the request within 5 seconds
               select
                  COLLECTION_OF_SENSORS.FORCE_RECORD(OF_SENSOR => I);
               or
                  delay 5.0;
                  ALARM.POST_FAULT_IN_SENSOR;
               end select;
            end loop;
            TICKS := 0;
         end if;
      end INTERRUPT;
   end loop;
end TIMER;


with SYSTEM;
separate (MONITOR_TEMPERATURES)
task body ALARM is
   BITS  : constant := 1;
   WORDS : constant := 16 * BITS;

   type LIGHT is (OFF, ON);
   for LIGHT'SIZE use 1 * WORDS;
   for LIGHT use (OFF => 16#0000#, ON => 16#FFFF#);

   FAULT_LIGHT : LIGHT := OFF;
   for FAULT_LIGHT use at 16#0010#;

   type LIMIT_CHECK is array (SENSOR_NAME) of LIGHT;
   for LIMIT_CHECK'SIZE use (SENSOR_NAME'POS(SENSOR_NAME'LAST)
                             + 1) * WORDS;

   OUT_OF_LIMITS_LIGHT : LIMIT_CHECK := LIMIT_CHECK'(others => OFF);
   for OUT_OF_LIMITS_LIGHT use at 16#0011#;
begin
   loop
      select
         accept POST_FAULT_IN_SENSOR do
            FAULT_LIGHT := ON;
         end POST_FAULT_IN_SENSOR;
      or
         accept POST_OUT_OF_LIMITS(ON_SENSOR : in SENSOR_NAME) do
            OUT_OF_LIMITS_LIGHT(ON_SENSOR) := ON;
         end POST_OUT_OF_LIMITS;
      end select;
   end loop;
end ALARM;




with DEVICE_IO;
separate (MONITOR_TEMPERATURES)
task body RECORDING_DEVICE is
begin
   loop
      accept LOG_THE_STATUS(OF_SENSOR  : in SENSOR_NAME;
                            WITH_VALUE : in SENSOR_VALUE;
                            WITH_STATE : in SENSOR_STATE) do
         DEVICE_IO.PUT(OF_SENSOR);
         DEVICE_IO.PUT(WITH_VALUE);
         DEVICE_IO.PUT(WITH_STATE);
      end LOG_THE_STATUS;
   end loop;
end RECORDING_DEVICE;


with SET_PACKAGE, SYSTEM;
separate (MONITOR_TEMPERATURES)
task body COLLECTION_OF_SENSORS is
   BITS  : constant := 1;
   WORDS : constant := 16 * BITS;

   type SENSOR_RECORD is
      record
         HIGH_LIMIT : SENSOR_VALUE := SENSOR_VALUE'LAST;
         LOW_LIMIT  : SENSOR_VALUE := SENSOR_VALUE'FIRST;
         VALUE      : SENSOR_VALUE := SENSOR_VALUE'FIRST;
      end record;

   type SENSOR_GROUP is array (SENSOR_NAME) of SENSOR_RECORD;

   SENSOR : SENSOR_GROUP;

   package SENSOR_SET is new SET_PACKAGE(UNIVERSE => SENSOR_NAME);
   use SENSOR_SET;

   ACTIVE_SENSORS : SET := NULL_SET;

   type SENSOR_PORT is range 0 .. (2 ** WORDS - 1);
   for SENSOR_PORT'SIZE use 1 * WORDS;

   type SENSOR_LIST is array (SENSOR_NAME) of SENSOR_PORT;
   for SENSOR_LIST'SIZE use (SENSOR_NAME'POS(SENSOR_NAME'LAST) + 1)
                            * WORDS;

   SENSOR_MAP : SENSOR_LIST;
   for SENSOR_MAP use at 16#0100#;
begin
   loop
      select
         accept DISABLE(SENSOR : in SENSOR_NAME) do
            ACTIVE_SENSORS := ACTIVE_SENSORS - SENSOR;
         end DISABLE;
      or
         accept ENABLE(SENSOR : in SENSOR_NAME) do
            ACTIVE_SENSORS := ACTIVE_SENSORS + SENSOR;
         end ENABLE;
      or
         accept FORCE_RECORD(OF_SENSOR : in SENSOR_NAME) do
            if IS_A_MEMBER(OF_SENSOR, OF_SET => ACTIVE_SENSORS) then
               RECORDING_DEVICE.LOG_THE_STATUS(OF_SENSOR,
                  WITH_VALUE => SENSOR(OF_SENSOR).VALUE,
                  WITH_STATE => ENABLED);
            else
               RECORDING_DEVICE.LOG_THE_STATUS(OF_SENSOR,
                  WITH_VALUE => SENSOR_VALUE'FIRST,
                  WITH_STATE => DISABLED);
            end if;
         end FORCE_RECORD;
      or
         accept SET_THE_LIMITS(FOR_SENSOR : in SENSOR_NAME;
                               LOW_LIMIT  : in SENSOR_VALUE;
                               HIGH_LIMIT : in SENSOR_VALUE) do
            SENSOR(FOR_SENSOR).LOW_LIMIT  := LOW_LIMIT;
            SENSOR(FOR_SENSOR).HIGH_LIMIT := HIGH_LIMIT;
         end SET_THE_LIMITS;
      else
         -- no entry call is pending: poll every active sensor
         for I in SENSOR_NAME
         loop
            if IS_A_MEMBER(I, OF_SET => ACTIVE_SENSORS) then
               SENSOR(I).VALUE :=
                  (SENSOR_MAP(I) * SENSOR_VALUE(0.5));
               if (SENSOR(I).VALUE < SENSOR(I).LOW_LIMIT) or
                  (SENSOR(I).VALUE > SENSOR(I).HIGH_LIMIT) then
                  ALARM.POST_OUT_OF_LIMITS(I);
               end if;
            end if;
         end loop;
      end select;
   end loop;
end COLLECTION_OF_SENSORS;